11/18/2021

Getting Started

Setup

Welcome! While we’re waiting:

  • Navigate to the workshop webpage: https://github.com/dlab-berkeley/Census-Data-in-R

  • Clone or download the workshop files by clicking on the green CODE button.

    • If you download the zipfile, unzip it.

    • Make a note of the folder in which the workshop files reside.

Introduction

  • About me

  • About you

    • Your familiarity with US Census data
    • With geospatial data
    • With geospatial data in R

Outline

  • Describe primary US Census data products

  • Introduce R packages for working with census data

  • Use those packages to fetch census data

  • Use those packages to fetch census data plus census geographic boundary files

  • Make maps of census data

Census Data Overview

US Census Data

The “nation’s leading provider of quality data about its people and economy.”

Available at https://www.census.gov

Primary Census Products

  • Decennial Census

  • American Community Survey (ACS)

Decennial Census

Complete count of the population every 10 years since 1790

Includes data on

  • Population, by sex, age, race/ethnicity, and family / household relationships

  • Housing, by occupancy (occupied, vacant), tenure (owned, rented), and group quarters

From 1940 to 2000, additional questions were asked of a sample of the population.

American Community Survey (ACS)

Since 2005, the American Community Survey (ACS) has replaced the decennial census sample data questions.

  • Annual survey of a sample of about 3.5 million households

  • Provides estimates of demographic, social, economic, and housing characteristics

  • Includes margin of error values for the estimates
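As a rough illustration of how a margin of error is used, the sketch below turns an estimate and its MOE into a confidence interval and a coefficient of variation. The estimate and MOE values are made up for the example. The 1.645 factor reflects the 90% confidence level at which ACS margins of error are published.

```r
# Hypothetical ACS estimate and its published margin of error (MOE).
# These two numbers are invented for illustration only.
estimate <- 50000
moe      <- 2500

# ACS MOEs are published at the 90% confidence level,
# so estimate +/- MOE gives a 90% confidence interval.
ci_lower <- estimate - moe
ci_upper <- estimate + moe

# Convert the MOE to a standard error, then to a coefficient of
# variation (CV), a common rule-of-thumb measure of reliability.
z_90           <- 1.645
standard_error <- moe / z_90
cv_percent     <- 100 * standard_error / estimate

cat(sprintf("90%% CI: %.0f to %.0f\n", ci_lower, ci_upper))
cat(sprintf("CV: %.2f%%\n", cv_percent))
```

A lower CV suggests a more reliable estimate; what counts as acceptable depends on the analysis.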

Decennial Census* versus ACS Data

Demographic*      Social              Economic            Housing
Sex               Families            Income              Tenure*
Age               Education           Benefits            Occupancy*
Race              Marital Status      Employment Status   Group quarters*
Hispanic Origin   Fertility           Occupation          Housing Value
Relationships     Grandparents        Industry            Taxes & Insurance
                  Veterans            Commuting           Utilities
                  Disability Status   Place of Work       Mortgage
                  Language at Home    Health Insurance    Monthly Rent
                  Citizenship                             Structure Type
                  Mobility

Census Geographies

Census microdata (data collected from individuals) are aggregated and published at one or more levels of geographic aggregation, known as tabulation units.

  • Tabulation units include states, counties, census tracts, block groups, blocks, etc.

Not all data tables are available for all geographies, e.g., only decennial census data are available at the block level.
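Tabulation units nest inside one another, and the nesting shows up in the GEOID codes attached to census records. The sketch below splits a 12-digit block group GEOID into its parts (2-digit state, 3-digit county, 6-digit tract, 1-digit block group). The specific GEOID is an illustrative value, not one taken from the workshop data.

```r
# Illustrative 12-digit block group GEOID (assumed example value).
geoid <- "060014001001"

# Each tabulation unit contributes a fixed-width piece of the code.
parts <- c(
  state       = substr(geoid, 1, 2),    # 2-digit state FIPS code
  county      = substr(geoid, 3, 5),    # 3-digit county code
  tract       = substr(geoid, 6, 11),   # 6-digit census tract code
  block_group = substr(geoid, 12, 12)   # 1-digit block group code
)

# The GEOID of a parent unit is a prefix of the child's GEOID.
county_geoid <- substr(geoid, 1, 5)
tract_geoid  <- substr(geoid, 1, 11)

print(parts)
```

Because parent GEOIDs are prefixes, records at a fine level can be matched to their county or tract by truncating the code.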

Census Geographic Tabulation Units

Census Data and Census Geographies

ACS Data Products

ACS 1-year and 5-year products are currently available through 2019

  • New data is released at the end of the next year (2020 data on Dec 8, 2021)

ACS 3-year product is no longer available (2008–2013)

ACS 5-year data provide more reliable estimates, with lower margins of error, than 1-year data

  • More data tables are available for ACS 5-year product

See: Census ACS: Guidance for Data Users

Census Data Workflow

Identify your

  • Topic of interest, e.g., population by age, income, monthly rents, etc…
  • Dataset: Decennial Census or ACS?
  • Year(s): for what time period?
  • Geographic tabulation unit of aggregation (county, tract, etc.)
  • Geographic filter by state(s) or counties

Then determine what specific census variables are available for your topic.
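One way to make these choices concrete is to write them down and then search a table of variables for the topic. The sketch below uses a small hand-made table as a stand-in for the full variable list. Its three rows and their concept strings are typed in for illustration and should be checked against the real list. The commented line shows how tidycensus's `load_variables()` could supply the full table.

```r
# The workflow choices, recorded as a plain list (values are examples).
request <- list(
  topic     = "median gross rent",
  dataset   = "acs5",     # ACS 5-year product
  year      = 2019,
  geography = "county",
  state     = "CA"
)

# Hand-made stand-in for a variable table (illustration only).
vars <- data.frame(
  name    = c("B01003_001", "B19013_001", "B25064_001"),
  concept = c("TOTAL POPULATION",
              "MEDIAN HOUSEHOLD INCOME IN THE PAST 12 MONTHS",
              "MEDIAN GROSS RENT (DOLLARS)"),
  stringsAsFactors = FALSE
)

# With tidycensus installed and a network connection, the full table
# could instead come from:
# vars <- tidycensus::load_variables(request$year, request$dataset, cache = TRUE)

# Keep only the rows whose concept mentions the topic of interest.
matches <- vars[grepl(request$topic, vars$concept, ignore.case = TRUE), ]
print(matches)
```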

CAUTION

“If you want to measure change you can’t change the measures!”

Census tables, variables, geographies, and geographic boundaries change over time!

Measuring change over time with census data is a complex topic in its own right and is not covered in this workshop!

Accessing Census Data

Here are three of the primary websites from which you can directly access and download census data:

You can download Census geographic data directly on the census website or from NHGIS.

Census APIs

You can write code to fetch data from the Census Web APIs

  • API: application programming interface

  • Web API: URLs can be formatted to make queries that return data
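To give a sense of what a formatted query URL looks like, the sketch below assembles one for the ACS 5-year endpoint using base R only. The variable (B01003_001E, total population) and the state code (06, California) are example choices, and nothing is actually fetched.

```r
# Pieces of a Census API query (example values).
base_url  <- "https://api.census.gov/data"
year      <- 2019
dataset   <- "acs/acs5"                  # ACS 5-year product
variables <- c("NAME", "B01003_001E")    # place name + total population

# Assemble the URL: variables go in `get`, the tabulation unit in `for`,
# and the geographic filter in `in`.
query_url <- paste0(
  base_url, "/", year, "/", dataset,
  "?get=", paste(variables, collapse = ","),
  "&for=county:*",     # all counties ...
  "&in=state:06"       # ... within California
)

print(query_url)
```

Pasting a URL like this into a browser returns the data as JSON. For heavier use, an API key can be appended as an additional `key` parameter.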

Or you can leverage an existing R package to make this easier!

  • That’s what we will do!

Only a subset of recent Census data products is available via the APIs.

R Packages for Working with Census Data

tidycensus

An R package with functions that make it easier to fetch decennial census and ACS data from the Census APIs.

Only a limited set of Census data is available via tidycensus:

  • Decennial census: 1990, 2000, and 2010

  • ACS 1 yr: 2005 through 2019

  • ACS 5 yr: 2005–2009 through 2015–2019

Actively maintained and expanding to include more census data products (see tidycensus website)
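A minimal sketch of a tidycensus request follows, assuming the package is installed and a Census API key is stored in the `CENSUS_API_KEY` environment variable. The variable B19013_001 (median household income) and the state are example choices. The call is skipped when either prerequisite is missing.

```r
# Check the prerequisites before calling the API.
have_tidycensus <- requireNamespace("tidycensus", quietly = TRUE)
have_key        <- nzchar(Sys.getenv("CENSUS_API_KEY"))

if (have_tidycensus && have_key) {
  # Fetch ACS 5-year (2015-2019) median household income by county.
  ca_income <- tidycensus::get_acs(
    geography = "county",
    variables = c(median_income = "B19013_001"),
    state     = "CA",
    year      = 2019,
    survey    = "acs5",
    geometry  = FALSE   # set to TRUE to also fetch boundaries as an sf object
  )
  print(head(ca_income))
} else {
  message("Skipping: needs the tidycensus package and a Census API key.")
}
```

The result is a tidy table with one row per county and columns for the GEOID, name, variable, estimate, and margin of error.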

About tidycensus

tidycensus tutorials

tidyverse

The tidyverse package is an umbrella package that installs all the core tidyverse packages and makes them easier to manage and load in R, including:

  • ggplot2, for data visualization
  • dplyr, for data manipulation
  • tidyr, for data tidying
  • readr, for data import
  • purrr, for functional programming
  • tibble, for tibbles, a modern re-imagining of data frames
  • stringr, for strings
  • forcats, for factors

Learn more about the tidyverse at https://www.tidyverse.org.

sf package

Simple features for geospatial data objects and methods.

  • The main R package for working with vector geospatial data
    • vector: locations represented as points, lines and polygons

sf is loaded and used automatically by tidycensus.
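As a small illustration of vector data in sf, the sketch below converts a table of coordinates into point features. The place names and approximate coordinates are made up for the example, and the sf step is skipped if the package is not installed.

```r
# A plain table with approximate longitude/latitude (illustrative values).
places <- data.frame(
  name = c("Berkeley", "Oakland"),
  lon  = c(-122.27, -122.27),
  lat  = c(37.87, 37.80)
)

if (requireNamespace("sf", quietly = TRUE)) {
  # Turn the coordinate columns into point geometries (WGS84, EPSG:4326).
  pts <- sf::st_as_sf(places, coords = c("lon", "lat"), crs = 4326)

  # An sf object is a data frame with an attached geometry column.
  print(pts)
  print(sf::st_geometry_type(pts))
} else {
  message("Skipping: the sf package is not installed.")
}
```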

The online book Geocomputation with R is a great resource for learning about the sf package and working with geospatial data in R.

mapview

mapview provides functions for quickly and easily creating interactive maps for data exploration.
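A quick interactive map might look like the sketch below, which colors a few points by a data column. The site names, coordinates, and values are invented for illustration. The map is drawn only in an interactive session with both sf and mapview installed.

```r
# Made-up sites with a value to display (illustration only).
sites <- data.frame(
  name  = c("A", "B", "C"),
  value = c(10, 25, 40),
  lon   = c(-122.27, -122.25, -122.30),
  lat   = c(37.87, 37.85, 37.80)
)

# Interactive maps need an interactive session plus both packages.
can_map <- interactive() &&
  requireNamespace("sf", quietly = TRUE) &&
  requireNamespace("mapview", quietly = TRUE)

if (can_map) {
  pts <- sf::st_as_sf(sites, coords = c("lon", "lat"), crs = 4326)
  # zcol names the column used to color the features.
  print(mapview::mapview(pts, zcol = "value"))
} else {
  message("Skipping: run interactively with sf and mapview installed.")
}
```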

Requesting a Census API key